Abstract-TAP (Trusted and Active PDU transfers) is a new 
distributed architecture and protocol for ATM networks that 
provides assured transfers to a set of privileged VPI/VCI. The 
distributed architecture proposes a new Extended AAL type 5 
(EAAL-5), manages the privileged connections and offers an 
improvement in the performance when network connections 
cause some cell loss by taking advantage of the idle time in the 
traffic sources to carry out the retransmissions of CPCS-PDU- 
EAAL5. The trusted protocol is supported by our AcTMs 
(active ATM switch) model that we have equipped with 
hardware techniques and active software to achieve our 
objectives. Several simulations (using ON/OFF sources) 
demonstrate the effectiveness of the mechanism that recovers 
the congested PDUs locally at the congested switches with better 
goodput in the network. Also, the senders are alleviated of 
negative end-to-end retransmissions. The TAP is an active and 
distributed architecture in the sense that our protocol 
implements several active coordinated and self-collaborative 
software agents. 

I. INTRODUCTION AND RELATED WORK 

Reliability in ATM networks is provided by the 8-bit Header 
Error Control (HEC) field in the header of the ATM cells 
and by the Cyclic Redundancy Check (CRC) in the 
Convergence Sublayer Protocol Data Unit (CS-PDU). Error 
control is performed end-to-end by the terminals. The main 
problem is that a single cell loss causes a reassembly CRC 
error at AAL-5 level, which in turn leads to a retransmission 
of a complete PDU (e.g., an IP datagram). 

ATM networks experience three types of errors [1-2]: cell 
losses due to congestion in switches; corruption of data 
portions due to bit errors, and switching errors due to 
undetected corruption of the cell header. We note that 
congestion is by far the most common type of error, and here 
is where we want to improve the trusted transfers with TAP. 

Current literature describes three basic techniques to 
achieve reliability: ARQ [3] (Automatic Repeat Request), 
FEC [4-6] (Forward Error Correction) and hybrid 
mechanisms of ARQ in combination with FEC. 

ARQ adds latency (due to the cost of NACKs) and suffers 
from implosion, whereas FEC adds overhead, so its redundant 
code is useless when the network is experiencing congestion. 
Hence, ARQ may not be suitable for applications that require 
low latency, and FEC performs worse in networks with low 
bandwidth or frequent congestion. In our architecture we 
adopted ARQ with NACK (using RM cells) to alleviate the 
effect of implosion. Support for reliable multipoint cannot be 
based on retransmissions from the source. In TAP, the 
intermediate active nodes carry out the retransmissions. 

The most commonly proposed congestion control schemes 
to improve throughput and fairness, while minimizing delay 
in ATM networks, are the Random Cell Discard (RCD), 
Partial Packet Discard (PPD), Early Packet Discard (EPD) 
[7], Early Selective Packet Discard (ESPD), Fair Buffer 
Allocation (FBA) and Random Early Detection (RED). TAP 
uses a modified version of EPD, which we call Early Packet 
Discard and Relay (EPDR), to alleviate the effect of 
congestion and packet fragmentation. 

Nowadays, congestion control is delegated to end-to-end 
protocols such as TCP. This technique is easy to implement 
at high speeds and simplifies the switches, but the whole 
network is overloaded with retransmissions and there is no 
protection against greedy traffic sources. Fair bandwidth 
schemes protect well-behaved sources from misbehaving 
ones, and allow a diverse set of end-to-end congestion control 
mechanisms. We provide support for fair bandwidth 
allocation based on a delegated and modified WFQ 
(Weighted Fair Queueing) [8] scheme that reduces its 
implementation complexity. 

ATM Adaptation Layer type 5 (AAL-5) was developed to 
support non-assured transfers of user data frames, in which 
lost and corrupted Common Part Convergence Sublayer 
Service Data Units (CPCS-SDU) are not recovered by 
retransmission [1]. We propose EAAL-5 as an extended and 
enhanced native AAL-5. EAAL-5 is the part of TAP that 
supports assured service with retransmissions and is also 
compatible with native AAL-5. In this paper we propose a 
mechanism that takes advantage of the idle periods in the 
data sources to retransmit the Common Part Convergence 
Sublayer PDU (CPCS-PDU) of EAAL-5. 

Active, open and programmable networks are a new 
technical area [9-19] that explores ways in which network 
elements may be dynamically programmed by network 
managers, network operators or general users to accomplish 
the required QoS and other features such as customized 
services. This offers attractive advantages, but also important 
challenges in aspects such as performance, security or 
reliability. Hence, whether to move the service code (placed 
outside the transport network) to the network's switching 
nodes remains an open issue for research and development in 
customized routing and protocols. Many of the advantages of 
active protocols are achieved by installing active nodes at 
strategic points. Concepts such as active networks, protocol 
boosters or software agents are proposed and developed for 
IP networks; however, the proposals are insufficient for ATM 
networks. 

We have simulated and studied transfers that combine the 
active switch with other non-active ATM switches to 
constitute a VPN (Virtual Private Network), as we can see in 
Fig. 1. Our goal is to provide the TAP functionality in a 
network in which not all switches need to implement TAP. 
We implemented TAP with the Java Development Kit v1.2.1 
because of the special characteristics offered by Java. 

Section 2 describes the TAP architecture. In sections 3 and 4 
we present AcTMs, our prototype of an active switch that 
supports the architecture and the protocol. Section 5 describes 
and outlines our work currently in progress to enhance TAP. 
Finally we offer some concluding remarks in section 6. 

II. GENERAL DESCRIPTION OF TAP ARCHITECTURE 

AAL-5 was proposed [1] to reduce the overhead introduced by 
AAL3/4. The CPCS-PDU formats of native AAL-5 and 
EAAL-5 have the same fields. The tail of the PDU has 4 
fields. The CPCS-UU (User-to-User indication) field is used 
for the transfer of CPCS user-to-user information. The CPI 
(Common Part Indication) octet is used to align the CPCS- 
PDU tail to 64 bits. TAP utilizes these two octets as the PDU 
sequence number, which is assigned end-to-end by the 
EAAL-5 user. The CRC is used as in AAL-5 to detect bit 
errors in the CPCS-PDU. The value of the CRC is calculated 
over all the fields of the CPCS-PDU. The sequence number 
in PDUs is preserved end-to-end to avoid recalculating the 
CRC and modifying the tail of the CPCS-PDUs. 
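
As an illustration of the tail layout described above, the following Python sketch packs a 16-bit PDU sequence number into the UU and CPI octets of an 8-octet AAL-5 tail. It is a minimal sketch, not the EAAL-5 implementation: the function names are hypothetical, placing the high-order byte in UU is an assumption, and the CRC-32 field is left as a caller-supplied value rather than computed.

```python
import struct

CELL_PAYLOAD = 48  # payload octets carried by one ATM cell
TAIL_LEN = 8       # UU (1) + CPI (1) + Length (2) + CRC-32 (4)


def build_tail(pdu_id, payload_len, crc=0):
    """Build the 8-octet tail; UU and CPI jointly carry the 16-bit PDUid.

    Assumption: UU holds the high-order byte and CPI the low-order byte.
    The CRC is not computed here; it is passed in by the caller.
    """
    if not 0 <= pdu_id <= 0xFFFF:
        raise ValueError("PDUid must fit in the two octets UU and CPI")
    uu, cpi = pdu_id >> 8, pdu_id & 0xFF
    return struct.pack(">BBHI", uu, cpi, payload_len, crc)


def read_pdu_id(tail):
    """Recover the PDUid from a tail without touching the CRC field."""
    uu, cpi, _length, _crc = struct.unpack(">BBHI", tail)
    return (uu << 8) | cpi


def cells_needed(payload_len):
    """Cells required once the payload plus tail is padded to 48-octet units."""
    return -(-(payload_len + TAIL_LEN) // CELL_PAYLOAD)
```

Because an intermediate switch only reads the two octets, the tail (and therefore the CRC) travels unchanged from sender to receiver. With these definitions a 1,500-byte payload occupies 32 cells.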

To implement NACK we use the standard Resource 
Management (RM) cells. They are not sent at a fixed 
frequency but are generated only when a switch is congested. 
This avoids the overhead of a fixed number of RM cells, 
which would waste bandwidth. 

When congestion is detected, EPDR discards a PDU. Then 
the CCA Agent searches for the discarded PDU in the 
DMTE. If this PDU is not in the local DMTE, the RM Agent 
of the active node generates an RM cell which is transmitted 
backwards to the upstream active switch, indicating the 
sequence number of the discarded PDU. 
The RM cell must also contain the VPI/VCI to identify the 
connection that has experienced discard problems. This 
mechanism is required to relate the sources of traffic with 
their I/O ports, because equal VPI/VCI values may appear at 
different ports. Octets 22 to 51 of the RM cell [14] store the 
identifier Port/VPI/VCI/PDUid of the requested PDUs. 
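
The text fixes where the request is carried (octets 22 to 51, i.e. 30 octets) but not the width of each field. The sketch below shows one possible encoding of a single Port/VPI/VCI/PDUid request. The field widths (1 octet for the port, 2 octets each for VPI, VCI and PDUid), the zero padding and the function names are all assumptions made for illustration only.

```python
import struct

RM_FIRST_OCTET = 22
RM_LAST_OCTET = 51
FIELD_LEN = RM_LAST_OCTET - RM_FIRST_OCTET + 1  # 30 octets available

# Hypothetical entry layout: port (1 octet), VPI, VCI and PDUid (2 octets each).
ENTRY = struct.Struct(">BHHH")


def pack_request(port, vpi, vci, pdu_id):
    """Encode one retransmission request and pad it to the 30-octet field."""
    return ENTRY.pack(port, vpi, vci, pdu_id).ljust(FIELD_LEN, b"\x00")


def unpack_request(field_bytes):
    """Decode the request stored at the start of the 30-octet field."""
    if len(field_bytes) != FIELD_LEN:
        raise ValueError("expected the 30 octets numbered 22 to 51")
    return ENTRY.unpack_from(field_bytes, 0)
```

With 7 octets per entry, the same field could hold up to four requests; carrying more than one would also require a count or terminator, which the text does not specify.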

When the RM cell arrives at an active switch, the TAP 
searches for the requested PDU and, if it is still in the DMTE, 
the PDU is retransmitted, as long as the idle time for the 
connection is sufficient. 

When a NACK (RM cell) arrives at a non-active switch, 
the switch only processes the RM cell and resends it to its 
neighboring switch in the direction of the closest upstream 
active switch. The non-active switches do not have a DMTE 
from which to retrieve PDUs; their function is only to send 
(or resend) PDUs forward to their destination and to send 
NACKs backwards to the active nodes. 
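
The recovery path described in the last paragraphs can be sketched as a walk upstream from the congested switch. This is an illustrative model only: the `Switch` class and `find_retransmitter` function are hypothetical names, the DMTE is reduced to a set of keys, and the idle-time check performed before retransmitting is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Switch:
    name: str
    active: bool = False
    dmte: set = field(default_factory=set)  # keys of the PDUs currently stored


def find_retransmitter(upstream_path, key):
    """Return the name of the switch that can retransmit `key`, or None.

    `upstream_path` starts at the congested switch and continues towards
    the source. Non-active switches only forward the RM cell; an active
    switch answers from its DMTE or passes the request further upstream.
    """
    for switch in upstream_path:
        if not switch.active:
            continue  # no DMTE here: the NACK is simply forwarded
        if key in switch.dmte:
            return switch.name
    return None  # not recoverable within the network
```

For example, with a congested active switch that no longer holds the PDU, followed by a non-active switch and an active switch that still stores it, the request is served by the third switch.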

To conclude this point, we emphasize that TAP cannot offer 
complete reliability, but it recovers a significant number of 
PDUs that would otherwise be lost to congestion. The 
mechanism also guarantees that retransmissions are not 
end-to-end but take place between peer active nodes. The 
retransmission mechanism generates unordered PDUs at the 
receiver. The protocol offers two kinds of service. The first 
one, named SEQ (sequential), sorts the PDUs and, when it 
detects a sequence failure, assumes that the PDU is lost and 
leaves the retrieval to upper-layer protocols (e.g., TCP). 
The second type of service is unordered (connectionless) and 
does not do any kind of sorting. These two services are 
offered by the proposed EAAL-5. 

III. AcTMs (Active ATM Switches) NODES 

Fig. 2 plots the TAP architecture including the DMTE 
memory and the agents. The trusted sources generate their 
flow that arrives at the DMTE and which is processed to the 
output buffer that multiplexes the cells to the next switch. 

The architecture of the ATM switch is similar to an output 
buffered switch with VC-merging capabilities. We propose 
the AcTMs switch model, which supports the TAP 
architecture and protocol and has the following main parts. 

Fig. 1. Virtual Private Network with TAP. 



Fig. 2. TAP architecture over AcTMs switch. 

A. DMTE (Dynamic Memory of Trusted EAAL-5 PDUs) 

The DMTE is the key module. It behaves as a common 
shared memory and VC-merge buffer at the same time. The 
main function of this memory is to store temporarily the 
PDUs in the active switch after they have been transmitted to 
the output buffer, so that they can be requested in 
retransmissions. The TAP protocol keeps a copy of each PDU 
that arrives at the active switch (while VC merging is 
performed). Several PDUs are stored for each privileged 
connection. When a complete PDU arrives at the DMTE it is 
"copied" completely into the output buffer. The PDU remains 
in the DMTE until free space is needed for storing a new 
PDU. 

Due to the large size of an AAL-5 PDU (up to 65,535 bytes) 
and the potentially high number of connections (VPI/VCI), 
the size of the DMTE may be excessive. This is the reason 
why we limit the number of stored PDUs for each connection 
and, furthermore, only support a reduced number of 
privileged VCIs with trusted transfers. The traffic of a 
connection is more trusted when the DMTE stores more 
PDUs of this connection. But the size of the DMTE also 
depends on the size of the PDUs, which is variable. We have 
calculated the memory size required to guarantee a given 
number of connections. Table I shows some of the results: 
offering trusted transfers to 1000 sources needs only 
3 Mbytes of memory, which stores 2 PDUs of 1500 bytes for 
each privileged connection. 
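
The sizing above is a simple product of the number of connections, the PDUs kept per connection and the PDU size. The short sketch below reproduces the quoted figure; the function name is hypothetical and any per-entry indexing overhead is ignored.

```python
def dmte_bytes(connections, pdus_per_connection, pdu_size):
    """Memory needed to hold the stored PDUs, ignoring index overhead."""
    return connections * pdus_per_connection * pdu_size


# 1000 privileged connections, 2 PDUs each, 1500 bytes per PDU
required = dmte_bytes(1000, 2, 1500)
print(required)  # 3000000 bytes, i.e. the 3 Mbytes quoted in the text
```

The same function makes the worst case visible: with maximum-size PDUs of 65,535 bytes the same 1000 connections would need about 131 Mbytes, which is why the number of stored PDUs and privileged connections is limited.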

The TAP accesses the DMTE through an index consisting 
of the port number, the PDUid (which corresponds to the UU 
and CPI fields in AAL5) and the VPI/VCI, and we have 
implemented different mechanisms to optimize the 
management and storage of PDUs. 

While a PDU from the DMTE is being retransmitted, if 
there is an incoming PDU with the same VPI/VCI and no free 
space is available in the DMTE, the incoming PDU is 
discarded. It is similar to a loss caused by the lack of VC- 
merge buffers. 
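
A minimal sketch of a DMTE-like store follows, assuming a fixed number of PDUs per connection and eviction of the oldest PDU when a new one arrives. The class and method names are hypothetical, and the sketch does not model the case described above in which an incoming PDU is discarded while a retransmission is in progress.

```python
from collections import OrderedDict, defaultdict


class Dmte:
    """Keeps the most recent PDUs of each privileged connection."""

    def __init__(self, pdus_per_connection=2):
        self.limit = pdus_per_connection
        # (port, vpi, vci) -> OrderedDict {pdu_id: pdu bytes}, oldest first
        self.store = defaultdict(OrderedDict)

    def put(self, port, vpi, vci, pdu_id, pdu):
        """Store a copy of the PDU, evicting the oldest one if needed."""
        slot = self.store[(port, vpi, vci)]
        if pdu_id not in slot and len(slot) >= self.limit:
            slot.popitem(last=False)  # free space for the new PDU
        slot[pdu_id] = pdu

    def get(self, port, vpi, vci, pdu_id):
        """Return the stored PDU, or None if it has already been replaced."""
        return self.store.get((port, vpi, vci), {}).get(pdu_id)
```

Usage: after storing PDUs 1, 2 and 3 of the same connection with a limit of 2, a request for PDU 1 fails while PDUs 2 and 3 can still be retransmitted. This mirrors the behaviour that a dense flow pushes old PDUs out of the memory.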

B. Input buffer 

As we can see in Fig. 2, the ATM cells arrive and are 
multiplexed over the input buffer where the cells are 
reassembled to build the PDUs. We use a size of 3000 octets, 
but the TAP simulator provides features to customize the 
number of cells that the input buffer stores. 



TABLE I 

REQUIRED DMTE SIZE TO GUARANTEE X CONNECTIONS 

PDU size   PDUs per connection   Connections   DMTE size 
1.5 Kb     -                     -             450 Kbytes 
1.5 Kb     2                     1000          3 Mbytes 


Every traffic management scheme requires queue 
management. Different methods for managing queues have 
different effects on the traffic that flows through the queues. 
Work-conserving systems (FIFO) send the PDUs as soon as 
the switch has completed the service time. Hence, the server 
is never idle while there are PDUs in a queue. On the other 
hand, non-work-conserving schemes wait a random amount 
of time before serving the next PDU in a queue, even if PDUs 
are waiting in the queue. While packet switching networks 
use window-based flow control (FIFO), high-speed networks 
need rate-based mechanisms and use work-conserving 
services and mechanisms such as Fair Queueing (FQ). FQ 
waits n-1 bit times before sending and has the problem that 
every source receives the same fraction of bandwidth. 
Weighted Fair Queueing (WFQ), however, offers strong 
performance guarantees. Although algorithms designed to 
achieve fair bandwidth allocation provide many desirable 
advantages for congestion control, their implementation 
complexity (per-flow scheduling, per-flow buffer 
management and per-PDU classification) is an important 
obstacle to their application in high-speed networks. We 
propose a WFQ scheme that works by delegation to the CSA 
agent, as we can see in section IV. 

C. Input/Output Tables 

Each output port in AcTMs has its corresponding 
Input/Output table that stores the access index to the DMTE. 
Before a PDU is sent to its output port, the I/O table that 
stores the InPort/VPIIn/VCIIn/PDUid/VPIOut/VCIOut/OutPort 
index is updated. This avoids duplicate PDU identifiers and 
also provides direct access to the PDUs in the DMTE. 

IV. MULTI-AGENT AND DISTRIBUTED SYSTEM 

We bring active characteristics to TAP through hardware 
mechanisms and software techniques. TAP architecture 
includes the CCA agent (Control Congestion Agent) to 
manage the retransmission requests between peer active 
switches, the DMA agent (Dynamic Memory Agent) to 
recover PDU from the DMTE and also the CSA and DPA 
agents. 

An active network is a programmable network that allows 
code to be loaded dynamically into network nodes at run- 
time. The literature on active networks studies several 
mechanisms to take advantage of active nodes. However, the 
proposals are insufficient for ATM networks; references 
[10-19] are some examples of this recent research in ATM. 

There is no consensus on when a network is active. There 
are two main views: a network is active if it incorporates 
active nodes with the capacity to execute a user's program, 
or if it implements mechanisms of code propagation. The 
TAP architecture is active in both senses, because it provides 
active nodes at strategic points that implement an active 
protocol to allow user's code to be loaded dynamically into 
network nodes at run-time. TAP also provides support for 
code propagation in the network thanks to the RM cells. TAP 
is also a distributed architecture in the sense that the protocol 
uses several active coordinated and self-collaborative agents. 

The ATM switch of our model network is an output 
buffered switch that just reads VPI/VCI information from 
arriving cells and forwards them to the corresponding output 
port. We have equipped this switch with active hardware (as 
we saw in section III) and software techniques (as we will 
see now). The TAP architecture uses four agents to perform 
the following functions. 

A. CSA (Class of Service Agent) 

This programmable agent supports the WFQ scheme by 
delegation. The TAP protocol extends the flow-based WFQ 
functionality to provide support for user-defined classes of 
service. CSA allows the definition of classes of service with a 
list of parameters for each class such as PCR, size of PDUs, 
QoS, exact amount of bandwidth, VPI/VCI identifiers, Ton, 
Toff, queue limit size, etc. When the list of parameters is 
defined in the transmitter, the CSA coordinates with the 
following agents to guarantee the defined class of service 
end-to-end, and the WFQ scheme provides congestion 
control and allocates bandwidth in a fair manner. 

B. CCA (Control Congestion Agent) 

The CCA programmable agent controls congestion based 
on EPDR and other algorithms that the network manager can 
choose. This agent monitors the output buffer and, when the 
occupancy is above the threshold, it discards any new 
incoming PDUs (packets). The last EAAL-5 cell contains the 
VPI and VCI in the header and the PDUid in the trailer of the 
EAAL-5-CPCS-PDU. The complete PDU is discarded as in 
EPD, but the information about the VPI, VCI and PDUid is 
used to generate a request for the retransmission of this PDU. 
If the requested PDU is still in the local DMTE, it may be 
recovered and resent to the output buffer. Otherwise, the 
request must be forwarded to the upstream active switch. 

Fig. 3 shows the EPD algorithm modified to carry out the 
retransmissions of PDUs when congestion is detected at the 
input buffer (Early Packet Discard and Relay). 



When a cell arrives at an ATM buffer: 
  if the cell's VPI/VCI belongs to the drop-list 
    if the cell is an EOM cell 
      if Queue_Length < Buffer_Size 
        insert the cell into the buffer 
        get PDUid 
        generate RM cell (Request PDUid) 
      else 
        discard the cell 
      remove the VPI/VCI from the drop-list 
    else 
      discard the cell 
  else 
    if Queue_Length < Threshold 
      insert the cell into the buffer 
    else if (BOM cell or (the buffer is full)) 
      discard the cell 
      capture the VPI/VCI into the drop-list 
    else 
      accept the cell into the single buffer 
Fig. 3. Early Packet Discard and Relay Algorithm. 
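
The algorithm of Fig. 3 can be exercised with the following Python sketch. It is a minimal transcription under stated assumptions: the class and attribute names are hypothetical, the buffer is a plain list that is never drained, the drop-list entry is removed on every EOM cell, and "generating an RM cell" is modelled by appending the request to a list.

```python
class EpdrBuffer:
    """Cell-level sketch of Early Packet Discard and Relay (EPDR)."""

    def __init__(self, buffer_size, threshold):
        self.buffer_size = buffer_size
        self.threshold = threshold
        self.queue = []         # accepted cells
        self.drop_list = set()  # connections whose current PDU is being dropped
        self.requests = []      # (vpi_vci, pdu_id) pairs to be sent in RM cells

    def on_cell(self, vpi_vci, bom=False, eom=False, pdu_id=None):
        """Process one arriving cell; return True if it is accepted."""
        if vpi_vci in self.drop_list:
            if not eom:
                return False  # remaining cells of a dropped PDU
            self.drop_list.discard(vpi_vci)
            if len(self.queue) < self.buffer_size:
                # Keep the EOM cell, read its PDUid and request the PDU.
                self.queue.append((vpi_vci, pdu_id))
                self.requests.append((vpi_vci, pdu_id))
                return True
            return False
        if len(self.queue) < self.threshold:
            self.queue.append((vpi_vci, pdu_id))
            return True
        if bom or len(self.queue) >= self.buffer_size:
            self.drop_list.add(vpi_vci)  # drop this PDU from now on
            return False
        self.queue.append((vpi_vci, pdu_id))  # PDU already partly accepted
        return True
```

The difference from plain EPD is the single `requests.append` line: the discarded PDU is not only dropped but also identified, so that a retransmission can be requested for it.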



Another important point is that CCAs do not perform a 
protocol in the classical way; that is, there is only one 
opportunity to recover a PDU. No sliding windows or timeout 
retransmissions are used in this proposal. 

C. DMA (Dynamic Memory Agent) 

The DMA agent is the access point to the DMTE memory. 
The CCA coordinates with the Dynamic Memory Agent to 
request retransmission of the PDU from the peer active 
switches. The function of the CCA agent is to generate native 
RM cells that are transmitted backwards to the upstream 
active switch. Non-active switches will recognize the RM 
cells as TAP RM cells and will not take any action on them; 
they will simply forward the RM cells. 

When a CCA agent receives an RM cell it takes on the task 
of looking for the requested PDU in the DMTE memory 
using Port/VPI/VCI/PDUid as the index. This work is 
delegated to the DMA agent, which uses the I/O tables to 
find the access index to the DMTE memory. If the PDU is 
still there, it means that the cell flow of the connection 
presents an idle period and the PDU may be recovered. The 
PDU is sent to the DPA agent, which dispatches it to the 
corresponding output port. When the PDU is no longer in the 
DMTE, the DMA agent notifies the CCA agent, which is 
responsible for generating an RM cell backwards to the 
previous ATM switch. 

We should recall that PDUids are assigned end-to-end for 
each VCC, so there is no chance of misinterpreting the 
requested PDU. If the cell flow of the connection is 
dense (very short idle periods between successive PDUs) then 
the new incoming PDUs will use the DMTE and the "old" 
PDUs will be removed. We are currently working on the use 
of RM cells as a transport mechanism to carry out code 
propagation between active nodes. This code contains 
instructions to optimize the retransmission of PDUs in 
multipoint connections. The CCA agents use these 
instructions to inspect the distribution tree in breadth, 
providing better goodput in retransmissions. 

D. DPA (Dispatcher PDUs Agent) 

This agent takes the complete PDUs from the input buffer 
and, once their Input/Output table is updated, sends each 
PDU to the correct output port. DPA guarantees that the 
PDUs are sent complete to the output port. This avoids the 
congestion of the output port and the negative effect of 
merging cells that belong to different PDUs or sources. 

V. PERFORMANCE EVALUATION 

Previous work [18,19] has presented and demonstrated the 
good behavior of RAP and TAP protocols over TAP 
architecture. We have simulated several software techniques 
to introduce active characteristics in switches. These 
mechanisms control and manage the privileged VCI and we 
will also offer an active mechanism to retrieve PDUs 
querying neighboring active switches and to search optimized 
paths when a PDU is retransmitted. 

The simulation allows us to define the congestion 
probability in transmitters, receivers and each ATM switch. 
When a node is undergoing congestion, it requests the 
retransmission (NACK) of the corresponding PDU. 
The simulation also permits the user to set variable 
values such as ON/OFF traffic source parameters, the number 
of transmitters and receivers, the number of non-active 
switches, the size of the DMTE and input buffer, etc. Also, 
the simulator lets the operator manage the parameters of 
programmable agents such as CSA and CCA. 

In our simulation, to analyse ATM cell loss we have used 
ON/OFF (bursty) traffic sources. The ON/OFF model [20,21] 
is used to characterise ATM traffic per unidirectional 
connection. Fig. 4 shows this model as a source which either 
actively sends (ON state) CPCS-PDU-AAL-5 data for some 
time Ton at a traffic rate R or PCR (Peak Cell Rate), or is 
silent (OFF state), producing no cells for some time Toff. 

The source also periodically generates empty time slots. 
In all examples we use a CSR (Cell Slot Rate) of C = 353,208 
cells/s, since our network model uses 155.52 Mbit/s links. 
When the cell arrival rate (CAR) is less than the CSR, there 
are empty slots during the active states, as we can see in 
Fig. 4. 

The cell inter-arrival time 1/CAR is the unit of time for the 
ON state, and the mean duration in the active state is 

$$T_{on} = \frac{1}{CAR}\, X_{on} \tag{1}$$

Also, the mean duration in the silent or idle state is 

$$T_{off} = \frac{1}{CSR}\, X_{off} \tag{2}$$

Table II shows the maximum and minimum source traffic 
descriptors used in our simulation. We use a process that 
switches between an idle (silent) state and the active state 
(sojourn time), which produces an average fixed rate of cells 
(between 64 Kbit/s and 25 Mbit/s) grouped in PDUs of 1,500 
bytes. During the ON states this process generates cells at a 
cell arrival rate CAR. 

Fig. 5 shows several identical sources, each operating 
independently over an AcTMs switch equipped with an ATM 
buffer of finite size X bytes and service capacity C 
cells/second, receiving cells from the ON/OFF sources. The 
Peak Cell Rate is PCR cells/second, so the Mean Cell Rate 
(MCR) for each ON/OFF source is 

$$MCR = PCR \cdot \frac{T_{on}}{T_{on} + T_{off}} \tag{3}$$

The probability that the source is active, or activity factor, 
is 

$$AF = \frac{MCR}{PCR} = \frac{T_{on}}{T_{on} + T_{off}} \tag{4}$$

We can calculate how many times the peak rate PCR fits 
into the service capacity C, denoted by S, 

$$S = \frac{C}{PCR} \tag{5}$$

Empirical studies [20] report Ton = 0.96 seconds and 
Toff = 1.69 seconds, and we use these values in the 
simulation, although we have also used other values to 
analyze their effect on TAP. We use these and other formulae 
to implement the simulation sources. Note that we have 
varied some of these parameters to analyse the behaviour of 
TAP when the scenario and the source traffic descriptors 
change, as we show in this section. 
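
As a worked check of the source model, the sketch below evaluates the mean number of cells in the active state, the mean number of empty slots in the idle state, the activity factor and the mean cell rate for the example values used in this section (CAR = 3,907 cells/s, CSR = 353,208 cells/s, Ton = 0.96 s, Toff = 1.69 s). The function name and the dictionary keys are hypothetical, and the peak rate is taken to be equal to CAR.

```python
def on_off_parameters(car, csr, t_on, t_off):
    """Derive ON/OFF source quantities from the rates and mean durations.

    car, csr: cell arrival rate and cell slot rate, in cells/s
    t_on, t_off: mean durations of the active and idle states, in seconds
    """
    x_on = t_on * car               # mean cells per active period (Ton = Xon / CAR)
    x_off = t_off * csr             # mean empty slots per idle period (Toff = Xoff / CSR)
    af = t_on / (t_on + t_off)      # activity factor
    mcr = car * af                  # mean cell rate, taking PCR = CAR
    return {"x_on": x_on, "x_off": x_off, "af": af, "mcr": mcr}


params = on_off_parameters(car=3907, csr=353208, t_on=0.96, t_off=1.69)
for name, value in params.items():
    print(f"{name} = {value:.2f}")
```

With these inputs the number of cells per active period is about 3,751 and the number of idle slots is about 596,921, which agrees with the values used in the simulation example; the source is active roughly 36% of the time.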



[Figure: ON/OFF source example with CAR = 3,907 cps, 
Ton = 0.96 s, Toff = 1.69 s, Pon = 0.99974, 
Poff = 0.99999832, Xon = 3,751 cells, Xoff = 596,921 slots, 
CSR = 353,208 cps and BL = 155.52 Mbps.] 

Fig. 4. Cell pattern for a single ON/OFF source and example 
of simulation. 



In this way, in the example shown in Fig. 4, 

S = 353,208/3,751 = 94.1 ON/OFF trusted sources with 
PCR = 1.5 Mbps and multiplexed over a link bandwidth of 
155.52 Mbps. 

So, we will have enough aggregated Toff time to retrieve 
PDUs as long as we do not exceed 94 multiplexed sources. 
This gives the maximum number of sources that we can have 
in the system while still taking advantage of the inactivity 
time to retransmit congested PDUs. 

When the TAP protocol detects that the aggregated PCR of 
the sources exceeds the service capacity C, with burst scale 
queuing, it does not request retransmissions of PDUs. This 
feature is accomplished by the CCA agent which, before 
requesting retransmissions, checks that they are possible in 
order to avoid wasting network resources. 



TABLE II 

SOURCE TRAFFIC DESCRIPTORS ON/OFF 

Source traffic descriptor                   Parameter   Minimum           Maximum 
Bandwidth source                            BS          64 kbit/s         25 Mbit/s 
Cell arrival rate                           CAR         167 cells/s       65,105 cells/s 
Cell inter-arrival time                     1/CAR       6 ms              15 µs 
Bandwidth link                              BL          155.52 Mbit/s     622 Mbit/s 
Cell slot rate                              CSR         353,208 cells/s   1,412,648 cells/s 
Service time per cell                       1/CSR       2.83 µs           0.70 µs 
Active time period                          Ton         0.96 s            1 s 
Mean number of cells in an active state     Xon         160 cells         65,105 cells 
Time in idle state                          Toff        1.69 s            2 s 
Mean number of empty slots in idle states   Xoff        596,921 slots     2,825,296 slots 

Fig. 5. Multiple ON/OFF sources. 

We now report some results from the simulation of the 
TAP protocol. 

Fig. 6 shows the effect of varying PCR between 86 and 
2,667 cells per second (33 Kbps to 1 Mbps respectively). 
In this simulation we fixed the congestion probability at 10 . 
We use an input buffer of 3,000 octets and the DMTE stores 
2 PDUs of 1,500 bytes for each connection. 

When PCR is 64 Kbps (167 cells/s), Ton = 0.96 s and 
Toff = 1.69 s, all 50 PDUs discarded by congestion are 
retrieved via TAP. Also, when PCR = 56 Kbps and 33 Kbps, 
TAP retrieves all the congested PDUs. Thus, the 
performance is optimal (50 retrieved PDUs out of 50 
congested PDUs), since all the lost PDUs are retrieved and 
there are no DMTE failures (all the requested PDUs are in 
the DMTE). 

As we can see, the lower the arrival rate, the larger the 
number of retrieved PDUs. When the PCR increases to 256 
Kbps, TAP retrieves 48 out of 50 PDUs, and the 2 lost PDUs 
are not requested because the protocol detects insufficient 
idle time (Toff) to do the retransmission. The number of 
NACKs not sent (not requested PDUs) grows as the PCR 
value increases. In this way, the network is not overloaded 
with useless retransmissions when there is not sufficient 
aggregate Toff. 

We note that at high PCR (1 Mbps) the number of retrieved 
PDUs is 47, and the 3 PDUs that are not retrieved are not 
requested either. As we can see, the goodput is optimized 
when the number of trusted sources does not exceed the 
service capacity C (see Fig. 4 and 5). 




[Figure: congested, retrieved and not requested PDUs versus 
PCR (Kbps).] 



Fig. 7 shows the results of varying the idle time (Toff) 
between 0.1 and 2 seconds. We now use 10 ON/OFF sources 
over a link bandwidth of 25 Mbps. Each source generates 500 
PDUs at PCR = 1 Mbps. We fixed a Cell Loss Rate of 5 % 
over the total emitted PDUs and we have a constant value of 
25 congested PDUs. As we can see, when the aggregated 
Toff is sufficient (0.5 seconds), all the congested PDUs are 
retrieved (25 retrieved out of 25 congested). When Toff is 
less than 0.5 s, the number of retrieved PDUs falls to 12 at 
0.3 s and to 3 at 0.1 s. So, when there is insufficient Toff, 
the number of unretrieved PDUs increases, but TAP 
preserves the goodput since the unrecoverable PDUs are not 
requested, to avoid overloading the network. 

Another scenario consists of 1 source node, 1 active ATM 
switch, n non-active switches and 1 destination node. When a 
NACK arrives at a non-active switch, the switch simply 
transfers the RM cell to the next switch. When the RM cell 
arrives at the active switch, the switch uses the DMTE to 
retransmit the requested PDU. This scenario is the same as 
above; only the number of non-active switches varies. In this 
configuration we have simulated the protocol with several 
non-active switches and the results obtained show no 
changes. Only the delay in transmissions varies due to 
propagation times, but the rate of retrieved PDUs is 
maintained as we have already shown. 

Previous work [18,19] presents a point-multipoint 
configuration consisting of 1 source node, 1 active ATM 
switch, n non-active switches and n destination nodes. This is 
equal to the above basic scenario; only the number of 
destination nodes in multipoint connections varies. At present 
we are working to add multipoint connections to TAP. If 
we consider the above results we can see intuitively that the 
total delay will change. Also, the amount of DMTE memory 
required in active switches increases to manage the VPI/VCI 
of n connections. 

We now describe several aspects on which we are working 
to achieve better goodput. 

We will consider other source traffic descriptors such as 
SCR (Sustainable Cell Rate) and MBS (Maximum Burst 
Size). With these parameters we can characterize the traffic 
better. Also, since most applications use the TCP protocol 
to transmit data in frame-based structures, we are 
working to implement the Guaranteed Frame Rate (GFR) 
[22] service class to provide a minimum service guarantee for 
UBR, VBR and ABR services. In order to support GFR we 
will simulate sources with a Minimum Cell Rate (MCR) 
guarantee for a given MBS and Maximum Frame Size 
(MFS). With the GFR service class, TAP will be able to 
distinguish eligible and non-eligible frames and also to 
discard cells properly. We are currently working to 
enhance the architecture by including other intelligent agents 
to characterize the traffic and its class of service. 

Fig. 6. Number of retrieved PDUs for different PCRs. 

Fig. 7. Effect of variation Toff. 



VI. SUMMARY 

In this paper we have presented TAP as the architecture for 
an active protocol that can take advantage of suitably 
equipped active ATM switches. TAP manages a set of 
privileged VCIs to improve trusted connections when the 
switches are congested. To achieve these active 
characteristics we use AcTMs (active ATM switches) 
equipped with little support hardware and software agents of 
reduced implementation complexity. We have verified that it 
is possible to retrieve a significant number of PDUs with 
only the DMTE and a reasonable additional complexity of 
the AcTMs switches, supported by software agents that 
implement variants of EPD to resolve congestion and of 
WFQ to achieve fair allocation. The retransmission 
mechanism is based on ARQ with NACK, which generates 
RM cells to request PDUs. Our simulations demonstrate that 
the intuitive idea of taking advantage of silent states in 
ON/OFF sources is valid. Thus we can achieve better goodput 
and QoS in ATM networks. 
Abstract-TAP (Trusted and Active PDU transfers) is a new
distributed architecture and protocol for ATM networks that
provides assured transfers to a set of privileged VPI/VCIs. The
distributed architecture proposes a new Extended AAL type 5
(EAAL-5), manages the privileged connections and improves
performance when network connections suffer cell loss, by
using the idle time of the traffic sources to carry out the
retransmissions of CPCS-PDU-EAAL5. The trusted protocol is
supported by our AcTMs (active ATM switch) model, which we
have equipped with hardware techniques and active software to
achieve our objectives. Several simulations (using ON/OFF
sources) demonstrate the effectiveness of the mechanism, which
recovers the PDUs lost to congestion locally at the congested
switches and achieves better goodput in the network. The
senders are also relieved of costly end-to-end retransmissions.
TAP is an active and distributed architecture in the sense that
our protocol implements several active, coordinated and
self-collaborative software agents.

I. INTRODUCTION AND RELATED WORK 

Reliability in ATM networks is provided by the 8-bit Header
Error Control (HEC) field in the header of the ATM
cells and by the Cyclic Redundancy Check (CRC) in the
Convergence Sublayer Protocol Data Unit (CS-PDU). Error
control is performed end-to-end by the terminals. The main
problem is that a single cell loss causes a reassembly CRC
error at AAL-5 level, which in turn leads to the retransmission
of a complete PDU (e.g., an IP datagram).

ATM networks experience three types of errors [1-2]: cell
losses due to congestion in switches, corruption of data
portions due to bit errors, and switching errors due to
undetected corruption of the cell header. Congestion is by
far the most common cause of error, and it is here that
we want to improve trusted transfers with TAP.

Current literature describes three basic techniques to
achieve reliability: ARQ [3] (Automatic Repeat Request),
FEC [4-6] (Forward Error Correction) and hybrid
mechanisms that combine ARQ with FEC.

ARQ adds latency (due to the cost of the NACKs) and can cause
implosion, whereas FEC adds overhead, and the redundant code
that it adds is useless when the network is
experiencing congestion. Hence, ARQ may not be suitable
for applications that require low latency, and FEC
performs worse in networks with low bandwidth or which
experience frequent congestion. In our architecture we
adopted ARQ with NACK (using RM cells) to alleviate the
effect of implosion. Support for reliable multipoint cannot be
based on retransmissions from the source. In TAP, the
intermediate active nodes carry out the retransmissions.

The most commonly proposed congestion control schemes
to improve throughput and fairness, while minimizing delay
in ATM networks, are Random Cell Discard (RCD),
Partial Packet Discard (PPD), Early Packet Discard (EPD)
[7], Early Selective Packet Discard (ESPD), Fair Buffer
Allocation (FBA) and Random Early Detection (RED). TAP
uses a modified version of EPD (which we have called
Early Packet Discard and Relay, EPDR) to alleviate the effect
of congestion and packet fragmentation.

Nowadays, congestion control is delegated to
end-to-end protocols such as TCP. This technique is easy to
implement at high speeds and also simplifies the
switches, but the whole network is overloaded with
retransmissions and there is no protection against
greedy traffic sources. Fair bandwidth schemes protect
well-behaved sources from misbehaving ones, and
allow a diverse set of end-to-end congestion control
mechanisms. We provide support for fair bandwidth
allocation based on a delegated and modified WFQ
(Weighted Fair Queueing) [8] scheme that reduces its
implementation complexity.

ATM Adaptation Layer type 5 (AAL-5) was
developed to support non-assured transfers of user data
frames, where a lost or corrupted Common Part
Convergence Sublayer Service Data Unit (CPCS-SDU)
cannot be recovered by retransmission [1]. We propose EAAL-5
as an extended and enhanced native AAL-5. EAAL-5 is
the part of TAP that supports assured service with
retransmissions, and it is also compatible with native AAL-5. In
this paper we propose a mechanism to take advantage of the
idle periods in the data sources to retransmit the Common
Part Convergence Sublayer PDU (CPCS-PDU) of EAAL-5.

Active, open and programmable networks are a new
technical area [9-19] that explores ways in which network
elements may be dynamically programmed by network
managers, network operators or general users to accomplish
the required QoS and other features such as customized
services. This offers attractive advantages, but also important
challenges in aspects such as performance, security and
reliability. Hence, it remains an open issue for research and
development in customized routing and protocols whether to
move the service code (placed outside the transport network)
to the network's switching nodes. Many of the advantages of
active protocols are achieved by installing active nodes at
strategic points. Concepts such as active networks, protocol
boosters and software agents have been proposed and developed
for IP networks; however, the proposals are insufficient for ATM
networks.

We have simulated and studied transfers that combine the
active switch with other non-active ATM switches to
constitute a VPN (Virtual Private Network), as we can see in
Fig. 1. Our goal is to use the TAP architecture to approximate
its functionality in a network in which not all switches
need to implement TAP. Java Development Kit v1.2.1 has
been the language and environment used to implement TAP,
due to the special characteristics offered by Java.

Section 2 describes the TAP architecture. In sections 3 and 4
we present AcTMs, our prototype of an active switch that
supports the architecture and the protocol. Section 5 evaluates
performance and outlines our work currently in progress to
enhance TAP. Finally, we offer some concluding remarks in
section 6.

II. GENERAL DESCRIPTION OF TAP ARCHITECTURE 

AAL-5 was proposed [1] to reduce the overhead introduced by
AAL3/4. The CPCS-PDU formats of native AAL-5 and
EAAL-5 have the same fields. The trailer of the PDU has 4
fields. The CPCS-UU (User-to-User indication) field is used for
the transfer of CPCS user-to-user information. The CPI (Common
Part Indication) octet is used to align the CPCS-PDU trailer
to 64 bits. TAP uses these two octets as the PDU sequence
number, which is assigned end-to-end by the EAAL-5 user.
The CRC is used as in AAL-5 to detect bit errors in the
CPCS-PDU. The value of the CRC is calculated over all the
fields of the CPCS-PDU. The sequence number in the PDUs is
preserved end-to-end to avoid recalculating the CRC and
modifying the trailer of the CPCS-PDUs.
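
As a rough illustration of this use of the trailer, the following Python sketch packs a 16-bit PDU sequence number into the CPCS-UU and CPI octets of an 8-octet AAL-5-style trailer (UU, CPI, Length, CRC). The function names and the high-octet-first split are assumptions made for illustration only, padding is omitted, and `zlib.crc32` is merely a stand-in checksum, not the exact AAL-5 CRC-32 procedure.

```python
import struct
import zlib

# Trailer layout assumed here: CPCS-UU (1), CPI (1), Length (2), CRC-32 (4).
TRAILER_FMT = ">BBHI"


def build_trailer(payload: bytes, pdu_id: int) -> bytes:
    """Build an 8-octet trailer that carries pdu_id in the UU and CPI octets."""
    if not 0 <= pdu_id <= 0xFFFF:
        raise ValueError("PDU sequence number must fit in two octets")
    if len(payload) > 0xFFFF:
        raise ValueError("payload too long for the 16-bit Length field")
    uu, cpi = pdu_id >> 8, pdu_id & 0xFF           # assumed split: high octet in UU
    head = struct.pack(">BBH", uu, cpi, len(payload))
    crc = zlib.crc32(payload + head) & 0xFFFFFFFF  # stand-in, not the AAL-5 CRC
    return head + struct.pack(">I", crc)


def read_pdu_id(trailer: bytes) -> int:
    """Recover the sequence number without touching the rest of the trailer."""
    uu, cpi, _length, _crc = struct.unpack(TRAILER_FMT, trailer)
    return (uu << 8) | cpi
```

Because the sequence number is part of the data covered by the checksum, a switch that only reads it (as `read_pdu_id` does) never needs to recompute the CRC.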

To implement NACK we use the standard Resource
Management (RM) cells. They are not sent at a fixed frequency
but are generated only when a switch is congested. This avoids
the overhead of a fixed number of RM cells,
which would waste bandwidth.

When congestion is detected, EPDR discards a PDU. The
CCA agent then searches for the discarded PDU in the
DMTE. If this PDU is not in the local DMTE, the RM agent
of the active node generates an RM cell, which is transmitted
backwards to the upstream active switch and indicates the
sequence number of the discarded PDU.

The RM cell must also contain the VPI/VCI to identify the
connection that has experienced the discard. This
mechanism is required to relate the traffic sources to
their I/O ports, so that equal VPI/VCI values at different
ports are not confused. Octets 22 to 51 of the RM cell
[14] store the Port/VPI/VCI/PDUid identifiers of the requested
PDUs.

When the RM cell arrives at an active switch, TAP
searches for the requested PDU and, if it is still in the DMTE,
the PDU is retransmitted, as long as the idle time of the
connection is sufficient.

When a NACK (RM cell) arrives at a non-active switch,
the switch only processes the RM cell and forwards it to its
neighboring switch in the direction of the closest upstream
active switch. The non-active switches have no DMTE from which
to retrieve PDUs; their only function is to send (or resend)
PDUs forward to their destination and to send NACKs
backwards to the active nodes.

To conclude this point, we emphasize that TAP cannot offer
complete reliability, but it recovers a significant
number of PDUs that would otherwise be lost to congestion.
The mechanism also guarantees that retransmissions are not
end-to-end but take place between peer active nodes. The
retransmission mechanism generates unordered PDUs at the
receiver. The protocol offers two kinds of service. The first
one, named SEQ (sequential), sorts the PDUs and, when it
detects a sequence failure, assumes that the PDU is lost and
leaves the retrieval to upper-layer protocols (e.g., TCP).
The second type of service is unordered (connectionless) and
does not do any kind of sorting. These two services are
offered by the proposed EAAL-5.

III. AcTMs (Active ATM Switches) NODES 

Fig. 2 shows the TAP architecture, including the DMTE
memory and the agents. The trusted sources generate a
flow that arrives at the DMTE and is passed on to the
output buffer, which multiplexes the cells to the next switch.

The architecture of the ATM switch is similar to an output
buffered switch with VC-merging capabilities. We propose
the AcTMs switch model to support the TAP
architecture and protocol. Its main components are described
below.

Fig. 1. Virtual Private Network with TAP.

Fig. 2. TAP architecture over AcTMs switch.

A. DMTE (Dynamic Memory of Trusted EAAL-5 PDUs)

The DMTE is the key module. It behaves as a common 
shared memory and VC-merge buffer at the same time. The 
main function of this memory is to store temporarily the 
PDUs in the active switch after they have been transmitted to 
the output buffer, so that they can be requested in 
retransmissions. The TAP protocol keeps a copy of each PDU 
that arrives at the active switch (while VC merging is 
performed). Several PDUs are stored for each privileged 
connection. When a complete PDU arrives at the DMTE it is 
"copied" completely into the output buffer. The PDU remains 
in the DMTE until free space is needed for storing a new 
PDU. 

Due to the large size of an AAL-5 PDU (up to 65,535 bytes)
and the potentially high number of connections (VPI/VCI), the
size of the DMTE may be excessive. This is the reason why
we limit the number of stored PDUs for each connection and,
furthermore, only support a reduced number of privileged
VCIs with trusted transfers. The traffic of a connection is
more trusted when the DMTE stores more PDUs of this
connection. But the size of the DMTE also depends on the size
of the PDUs, which is variable. We have
calculated the memory size required to guarantee a
given number of connections. Table I shows some of the
results obtained. We can see that offering trusted
transfers to 1000 sources needs only 3 Mbytes of memory
when the DMTE stores 2 PDUs of 1500 bytes for each privileged
connection.
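
The sizing rule behind these figures appears to be a simple product of the three quantities involved. The short Python sketch below reproduces two of the Table I entries under that assumption; the function name is hypothetical, and Kbytes and Mbytes are taken as decimal units.

```python
def dmte_size_bytes(pdus_per_connection: int, pdu_size_bytes: int,
                    connections: int) -> int:
    """Memory needed if every trusted connection keeps a fixed number of PDUs."""
    return pdus_per_connection * pdu_size_bytes * connections


# 2 PDUs of 1,500 bytes for 1,000 trusted connections -> 3,000,000 bytes (3 Mbytes)
small_pdus = dmte_size_bytes(2, 1_500, 1_000)

# 3 PDUs of 1,500 bytes for 10 trusted connections -> 45,000 bytes (45 Kbytes)
few_connections = dmte_size_bytes(3, 1_500, 10)
```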

The TAP accesses the DMTE through an index consisting 
of the port number, the PDUid (which corresponds to the UU 
and CPI fields in AAL5) and the VPI/VCI, and we have 
implemented different mechanisms to optimize the 
management and storage of PDUs. 

While a PDU from the DMTE is being retransmitted, if 
there is an incoming PDU with the same VPI/VCI and no free 
space is available in the DMTE, the incoming PDU is 
discarded. It is similar to a loss caused by the lack of VC- 
merge buffers. 
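
The storage behavior described in this subsection can be sketched as a small bounded store indexed by port, VPI/VCI and PDUid. This is a minimal Python illustration, not the authors' implementation: the class and method names are hypothetical, the oldest copy is assumed to be the one evicted when space is needed, and the case of a PDU that is being retransmitted while a new one arrives is not modeled.

```python
from collections import OrderedDict


class DMTE:
    """Keeps the most recent PDUs of each privileged connection."""

    def __init__(self, pdus_per_connection: int = 2):
        self.limit = pdus_per_connection
        self.store = {}  # (port, vpi, vci) -> OrderedDict{pdu_id: payload}

    def put(self, port, vpi, vci, pdu_id, payload):
        """Store a copy of a PDU, evicting the oldest copy if the limit is hit."""
        pdus = self.store.setdefault((port, vpi, vci), OrderedDict())
        pdus[pdu_id] = payload
        pdus.move_to_end(pdu_id)
        while len(pdus) > self.limit:
            pdus.popitem(last=False)  # assumed policy: drop the oldest copy

    def get(self, port, vpi, vci, pdu_id):
        """Return the stored PDU, or None if it has already been replaced."""
        return self.store.get((port, vpi, vci), {}).get(pdu_id)
```

With a limit of 2 PDUs per connection, storing PDUs 1, 2 and 3 of the same connection leaves only 2 and 3 available for retransmission.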

B. Input Buffer

As we can see in Fig. 2, the ATM cells arrive and are 
multiplexed over the input buffer where the cells are 
reassembled to build the PDUs. We use a size of 3000 octets, 
but the TAP simulator provides features to customize the 
number of cells that the input buffer stores. 



TABLE I
REQUIRED DMTE SIZE TO GUARANTEE X CONNECTIONS

PDUs stored      PDU size      Trusted        DMTE size
per connection                 connections
3                64 Kbytes     10             1.9 Mbytes
3                64 Kbytes     20             3.8 Mbytes
3                1.5 Kbytes    10             45 Kbytes
3                1.5 Kbytes    100            450 Kbytes
3                1.5 Kbytes    1000           4.5 Mbytes
2                1.5 Kbytes    1000           3 Mbytes



Every traffic management scheme requires queue
management. Different methods for managing queues have
different effects on the traffic that flows through them.
Work-conserving systems (such as FIFO) send each PDU as
soon as the switch has completed its service time. Hence, the
server is never idle while there are PDUs in a queue.
Non-work-conserving schemes, on the other hand, wait a random
amount of time before serving the next PDU in a queue, even
if PDUs are waiting in the queue. While packet switching
networks use window-based flow control (FIFO), high-speed
networks need rate-based mechanisms and use work-conserving
services and mechanisms such as Fair Queueing
(FQ). FQ waits n-1 bit times before sending and has the
problem that every source receives the same fraction of
bandwidth. Weighted Fair Queueing (WFQ), however, offers
strong performance guarantees. Although algorithms
designed to achieve fair bandwidth allocation provide many
desirable advantages for congestion control, their
implementation complexity (per-flow scheduling, per-flow
buffer management and per-PDU classification) is an
important obstacle to their application in high-speed
networks. We propose a WFQ scheme that works by
delegation to the CSA agent, as we will see in section IV.
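
To make the idea of weighted sharing concrete, here is a minimal Python sketch of a textbook-style scheduler that orders PDUs by weighted finish tags. It uses the self-clocked approximation of virtual time (the tag of the PDU last served), which is a simplification chosen for brevity; it is not the delegated WFQ variant proposed in this paper, and all names are hypothetical.

```python
import heapq


class WFQScheduler:
    """Serves PDUs in order of finish tag = start tag + size / weight."""

    def __init__(self, weights):
        self.weights = weights                    # flow id -> weight
        self.last_finish = {f: 0.0 for f in weights}
        self.virtual_time = 0.0
        self.heap = []                            # (finish, seq, flow, size)
        self.seq = 0                              # tie-breaker: arrival order

    def enqueue(self, flow, size):
        start = max(self.virtual_time, self.last_finish[flow])
        finish = start + size / self.weights[flow]
        self.last_finish[flow] = finish
        heapq.heappush(self.heap, (finish, self.seq, flow, size))
        self.seq += 1

    def dequeue(self):
        finish, _, flow, size = heapq.heappop(self.heap)
        self.virtual_time = finish                # self-clocked approximation
        return flow, size


sched = WFQScheduler({"a": 2, "b": 1})
for _ in range(4):
    sched.enqueue("a", 1500)
for _ in range(4):
    sched.enqueue("b", 1500)
served = [sched.dequeue()[0] for _ in range(6)]
```

With weights 2 and 1 and equal PDU sizes, flow "a" is served twice for each PDU of flow "b" while both are backlogged.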

C. Input/Output Tables 

Each output port in AcTMs has its corresponding 
Input/Output table that stores the access index to the DMTE. 
When a PDU is sent to its output port, the I/O table that stores 
the InPort/VPUn/VCUn/PDUid/VPlOut/VCIOut/OutPort 
index is previously updated. This avoids equal 
PDUidentifiers and also provides direct access to PDUs into 
DMTE. 

IV. MULTI-AGENT AND DISTRIBUTED SYSTEM 

We bring active characteristics to TAP through hardware
mechanisms and software techniques. The TAP architecture
includes the CCA agent (Control Congestion Agent) to
manage the retransmission requests between peer active
switches, the DMA agent (Dynamic Memory Agent) to
recover PDUs from the DMTE, and also the CSA and DPA
agents.

An active network is a programmable network that allows
code to be loaded dynamically into network nodes at
run-time. The literature on active networks studies several
mechanisms to take advantage of active nodes.
However, the proposals are insufficient for ATM networks;
references [10-19] are some examples of this recent
research in ATM.

There is no consensus on when a network is
active. There are two main views: a network is active if it
incorporates active nodes with the capacity to execute a
user's program, or if it implements mechanisms of code
propagation. The TAP architecture is active in both senses,
because it provides active nodes at strategic points that
implement an active protocol to allow user code to be
loaded dynamically into network nodes at run-time. TAP also
provides support for code propagation in the network thanks
to the RM cells. TAP is also a distributed architecture in the
sense that the protocol uses several active, coordinated and
self-collaborative agents.

The ATM switch of our model network is an output
buffered switch that just reads VPI/VCI information from
arriving cells and forwards them to the corresponding output
port. But we have equipped this switch with active hardware (as
we have just seen in section III) and software techniques (as
we will now see). The TAP architecture uses four agents to
perform the following functions.

A. CSA (Class of Service Agent) 

This programmable agent supports the WFQ scheme by
delegation. The TAP protocol extends the flow-based WFQ
functionality to provide support for user-defined classes of
service. CSA allows the definition of classes of service with a
list of parameters for each class such as PCR, size of PDUs,
QoS, exact amount of bandwidth, VPI/VCI identifiers, Ton,
Toff, queue limit size, etc. When the list of parameters is
defined in the transmitter, the CSA coordinates with the
following agents to guarantee the defined class of service
end-to-end, and the WFQ scheme provides congestion control
and allocates bandwidth in a fair manner.

B. CCA (Control Congestion Agent) 

The CCA programmable agent controls congestion based
on EPDR and other algorithms that the network manager can
choose. This agent monitors the output buffer and, when the
occupancy is above the threshold, it discards any new
incoming PDUs (packets). The last EAAL-5 cell contains the
VPI and VCI in the header and the PDUid in the trailer of the
EAAL-5-CPCS-PDU. The complete PDU is discarded as in
EPD, but the information about the VPI, VCI and PDUid is
used to generate a request for the retransmission of this PDU.
If the requested PDU is still in the local DMTE, it may be
recovered and resent to the output buffer. Otherwise, the
request must be forwarded to the upstream active switch.

Fig. 3 shows the EPD algorithm modified to carry out the
retransmission of PDUs when congestion is detected at the
input buffer (Early Packet Discard and Relay).



When a cell arrives at an ATM buffer:
    if the cell's VPI/VCI belongs to the drop-list
        if the cell is an EOM cell
            if Queue_Length < Buffer_Size
                insert the cell into buffer
                get PDUid
                generate RM cell (Request PDUid)
            else
                discard the cell
            remove the VPI/VCI from the drop-list
        else
            discard the cell
    else
        if Queue_Length < Threshold
            insert the cell into buffer
        else if (BOM cell or (the buffer is full))
            discard the cell
            capture the VPI/VCI into drop-list
        else
            accept the cell into the single buffer

Fig. 3. Early Packet Discard and Relay Algorithm.
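
The algorithm of Fig. 3 can be exercised with a small Python sketch such as the one below. It follows the branches of the figure as reconstructed from the text; the `Cell` and `EPDRBuffer` names, the BOM/EOM flags and the `requests` list (which stands in for the RM cells that would be generated) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Cell:
    vc: tuple                      # (VPI, VCI)
    bom: bool = False              # first cell of a PDU
    eom: bool = False              # last cell of a PDU
    pdu_id: Optional[int] = None   # only meaningful on the EOM cell


@dataclass
class EPDRBuffer:
    size: int
    threshold: int
    queue: list = field(default_factory=list)
    drop_list: set = field(default_factory=set)
    requests: list = field(default_factory=list)  # (vc, pdu_id) to request

    def on_cell(self, cell: Cell) -> bool:
        """Return True if the cell was queued, False if it was discarded."""
        if cell.vc in self.drop_list:
            if not cell.eom:
                return False                   # remainder of a dropped PDU
            self.drop_list.discard(cell.vc)    # PDU boundary reached
            if len(self.queue) < self.size:
                self.queue.append(cell)
                self.requests.append((cell.vc, cell.pdu_id))
                return True
            return False
        if len(self.queue) < self.threshold:
            self.queue.append(cell)
            return True
        if cell.bom or len(self.queue) >= self.size:
            self.drop_list.add(cell.vc)        # drop this PDU from now on
            return False
        self.queue.append(cell)                # PDU already in progress
        return True
```

A PDU that starts arriving while the queue is above the threshold is dropped as a whole, and its EOM cell triggers exactly one retransmission request.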



Another important point is that CCAs do not perform a 
protocol in the classical way; that is, there is only one 
opportunity to recover a PDU. No sliding windows or timeout 
retransmissions are used in this proposal. 

C. DMA (Dynamic Memory Agent) 

The DMA agent is the access point to the DMTE memory. The
CCA coordinates with the Dynamic Memory Agent to
request the retransmission of a PDU from the peer active
switches. The function of the CCA agent is to generate native
RM cells that are transmitted backwards to the upstream active
switch. Non-active switches will recognize the RM cells as TAP
RM cells and will not take any action on them; they will simply
forward the RM cells.

When a CCA agent receives an RM cell, it takes on the task
of looking for the requested PDU in the DMTE memory
using Port/VPI/VCI/PDUid as the index. This work is
delegated to the DMA agent, which uses the I/O tables to
find the access index to the DMTE memory. If the PDU is
still there, it means that the cell flow of the connection
presents an idle period and the PDU may be recovered. The PDU
is sent to the DPA agent, which dispatches it to its
corresponding output port. When the PDU is no longer in the
DMTE, the DMA agent notifies the CCA agent, which is
responsible for generating an RM cell backwards to the previous
ATM switch.
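
The decision taken for each incoming request can be summarized in a few lines of Python. In this sketch the DMTE is reduced to a dictionary keyed by (port, VPI, VCI, PDUid), the action labels are hypothetical, and the comparison of idle time against retransmission time is an assumed way of expressing the condition that the idle time must be sufficient.

```python
def handle_nack(dmte: dict, key: tuple, idle_time: float,
                retransmit_time: float) -> tuple:
    """Decide what an active switch does with one retransmission request.

    key is (port, vpi, vci, pdu_id); times are in seconds.
    """
    pdu = dmte.get(key)
    if pdu is None:
        # No longer stored locally: pass the request to the upstream switch.
        return ("forward_upstream", key)
    if idle_time < retransmit_time:
        # Assumed handling: do not retransmit without enough idle time.
        return ("skip", key)
    return ("retransmit", pdu)
```

There is a single attempt per request, in line with the absence of sliding windows and timeouts in the protocol.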

We should recall that PDUids are assigned end-to-end for
each VCC, so there is no chance of misinterpreting
the requested PDU. If the cell flow of the connection is
dense (very short idle periods between successive PDUs), then
the new incoming PDUs will use the DMTE and the "old"
PDUs will be removed. We are currently working on the use
of RM cells as a transport mechanism to carry out code
propagation between active nodes. This code contains
instructions to optimize the retransmission of PDUs in
multipoint connections. The CCA agents use these
instructions to inspect the distribution tree breadth-wise,
providing better goodput in retransmissions.

D. DPA (Dispatcher PDUs Agent)

This agent takes the complete PDUs from the input buffer
and, once their Input/Output table entry is updated, sends each
PDU to the correct output port. DPA guarantees that the PDUs
are sent complete to the output port. This avoids congestion of
the output port and also the negative effect of merging cells
that belong to different PDUs or sources.

V. PERFORMANCE EVALUATION 

Previous work [18,19] has presented and demonstrated the
good behavior of the RAP and TAP protocols over the TAP
architecture. We have simulated several software techniques
to introduce active characteristics in switches. These
mechanisms control and manage the privileged VCIs, and we
will also offer an active mechanism to retrieve PDUs by
querying neighboring active switches and to search for
optimized paths when a PDU is retransmitted.

The simulation allows us to define the congestion
probability in transmitters, receivers and each ATM switch.
When a node is undergoing congestion, it requests the
retransmission (NACK) of the corresponding PDU.
The simulation also lets the user set variable
values such as the ON/OFF traffic source parameters, the number
of transmitters and receivers, the number of non-active
switches, the size of the DMTE and the input buffer, etc. The
simulator also lets the operator manage the
parameters of programmable agents such as CSA and CCA.

In our simulation to analyse ATM cell loss we have used
ON/OFF (bursty) traffic sources. The ON/OFF model [20,21]
is used to characterise ATM traffic per unidirectional
connection. Fig. 4 shows this model as a source which either
actively sends (ON state) CPCS-PDU-AAL-5 data for some
time Ton at a traffic rate R or PCR (Peak Cell Rate), or is
silent (OFF state), producing no cells for some time Toff.

The source also periodically generates empty time slots.
In all examples we use a CSR (Cell Slot Rate) of C = 353,208
cell/s, since our network model uses 155.52 Mbit/s links.
When the cell arrival rate (CAR) is less than the CSR, there
are empty slots during the active states, as we can see in
Fig. 4.

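The cell inter-arrival time 1/CAR is the unit of time for the
ON state, and the mean duration in the active state is

$T_{on} = \frac{1}{CAR} \cdot X_{on}$,    (1)

where Xon is the mean number of cells in an active state.
Also, the mean duration in the silent or idle state is

$T_{off} = \frac{1}{CSR} \cdot X_{off}$,    (2)

where Xoff is the mean number of empty slots in an idle state.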



Empirical studies [20] report Ton = 0.96
seconds and Toff = 1.69 seconds. We use these values in
the simulation, although we have also used other values to
analyze their effect on TAP. We use these and other formulae to
implement the sources in the simulations. Note that we have
varied some of these parameters to analyse the behaviour of
TAP when the scenario and the source traffic
descriptors change, as we show in this section.
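
As a worked example of the two mean-duration relations, the Python sketch below derives the cell and slot counts from the durations and rates quoted in this section. The helper names are hypothetical. The conversion from source bandwidth to cell arrival rate assumes a 48-octet cell payload and rounding up; this is an assumption, although it does reproduce the 167 and 65,105 cells/s listed in Table II.

```python
PAYLOAD_BITS = 48 * 8  # assumed: only the 48-octet payload carries source bits


def cell_arrival_rate(bandwidth_bps: int) -> int:
    """Cells per second needed to carry bandwidth_bps, rounded up."""
    return -(-bandwidth_bps // PAYLOAD_BITS)


def mean_cells_on(t_on_s: float, car: float) -> float:
    """Xon = Ton * CAR: mean number of cells in an active state."""
    return t_on_s * car


def mean_slots_off(t_off_s: float, csr: float) -> float:
    """Xoff = Toff * CSR: mean number of empty slots in an idle state."""
    return t_off_s * csr


car_64k = cell_arrival_rate(64_000)       # 64 kbit/s source
x_on = mean_cells_on(0.96, car_64k)       # about 160 cells
x_off = mean_slots_off(1.69, 353_208)     # about 596,921 slots
```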



Fig. 4. Cell pattern for a single ON/OFF source and example
of simulation (CAR = 3,907 cps, Ton = 0.96 s, Toff = 1.69 s,
Xoff = 596,921 slots, CSR = 353,208 cps, BL = 155.52 Mbps).



Table II shows the minimum and maximum source traffic
descriptors used in our simulation. We use a process that
switches between an idle (silent) state and an active state
(sojourn time) that produces a fixed average rate of cells
(between 64 Kbit/s and 25 Mbit/s) grouped in PDUs of 1,500
bytes. During the ON states this process generates cells at a
cell arrival rate CAR.

Fig. 5 shows several identical sources, each operating
independently over an AcTMs switch equipped with an ATM
buffer of finite size X bytes and service capacity C
cells/second, receiving cells from the ON/OFF sources. The
Peak Cell Rate is PCR cells/second, so the Mean Cell Rate
(MCR) for each ON/OFF source is

$MCR = PCR \cdot \frac{T_{on}}{T_{on} + T_{off}}$.    (3)

The probability that the source is active, or activity factor,
is

$AF = \frac{MCR}{PCR} = \frac{T_{on}}{T_{on} + T_{off}}$.    (4)

We can calculate how many times the peak rate PCR fits
into the service capacity C, denoted by S:

$S = \frac{C}{PCR}$.    (5)



In this way, in the example shown in Fig. 4,
S = 353,208/3,751 = 94.1 ON/OFF trusted sources with
PCR = 1.5 Mbps can be multiplexed over a link bandwidth of
155.52 Mbps.

So, we will have enough aggregated Toff time to retrieve
PDUs as long as we do not exceed 94 multiplexed sources.
This gives the maximum number of sources that we can have
in the system while still taking advantage of the inactivity
time to retransmit congested PDUs.

When the TAP protocol detects that the aggregated PCR of
the sources exceeds the service capacity C, with burst-scale
queuing, it does not request retransmissions of PDUs. This
feature is implemented by the CCA agent which, before
requesting retransmissions, checks that they are possible in
order to avoid wasting network resources.
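
The quantities of equations (3) to (5) and the admission check described above can be illustrated with a short Python sketch. The function names are hypothetical, and the check shown (aggregate PCR compared with the service capacity C) is only a simplified reading of what the CCA agent verifies before requesting retransmissions. The numeric values are the ones quoted in the text.

```python
def activity_factor(t_on: float, t_off: float) -> float:
    """AF = Ton / (Ton + Toff)."""
    return t_on / (t_on + t_off)


def mean_cell_rate(pcr: float, t_on: float, t_off: float) -> float:
    """MCR = PCR * AF."""
    return pcr * activity_factor(t_on, t_off)


def max_sources(capacity: int, pcr: int) -> int:
    """Whole number of sources whose peak rates fit into C (S = C / PCR)."""
    return capacity // pcr


def retransmissions_allowed(n_sources: int, pcr: int, capacity: int) -> bool:
    """Request retransmissions only if the aggregate PCR does not exceed C."""
    return n_sources * pcr <= capacity


C, PCR = 353_208, 3_751                    # cells/s, values from the text
af = activity_factor(0.96, 1.69)           # about 0.36
limit = max_sources(C, PCR)                # 94 sources
```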



TABLE II
SOURCE TRAFFIC DESCRIPTORS ON/OFF

Source traffic descriptor        Parameter   Minimum          Maximum
Bandwidth source                 BS          64 kbit/s        25 Mbit/s
Cell arrival rate                CAR         167 cells/s      65,105 cells/s
Cell inter-arrival time          1/CAR       6 ms             15 µs
Bandwidth link                   BL          155.52 Mbit/s    622 Mbit/s
Cell slot rate                   CSR         353,208 cell/s   1,412,648 cell/s
Service time per cell            1/CSR       2.83 µs          0.70 µs
Active time period               Ton         0.96 s           1 s
Mean number of cells             Xon         160 cells        65,105 cells
in an active state
Time in idle state               Toff        1.69 s           2 s
Mean number of empty slots       Xoff        596,921 cell     2,825,296 cell
in idle states                               slots            slots

Fig. 5. Multiple ON/OFF sources.

We now report some results from the simulation of the 
TAP protocol. 

Fig. 6 shows the effect of varying PCR between 86 and
2,667 cells per second (33 Kbps to 1 Mbps respectively).
In this simulation we fixed the congestion probability at 10" . 
We use an input buffer of 3,000 octets and the DMTE stores
2 PDUs of 1,500 bytes for each connection.

When the PCR is 64 Kbps (167 cells/s), with Ton = 0.96 s
and Toff = 1.69 s, all of the 50 PDUs discarded by
congestion are retrieved via TAP. Also, when
PCR = 56 Kbps and 33 Kbps, TAP retrieves all the congested
PDUs. Thus, the performance is optimal (50 retrieved
PDUs out of 50 congested PDUs), since all the lost PDUs are
retrieved and there are no DMTE failures (all the requested
PDUs are in the DMTE).

As we can see, the lower the arrival rate, the higher the
number of retrieved PDUs. When the PCR increases to 256
Kbps, TAP retrieves 48 out of 50 PDUs, and the 2 lost PDUs
are not requested because the protocol detects insufficient
idle time (Toff) to do the retransmission. We can see how the
number of NACKs not sent (not requested PDUs) grows
as the PCR value increases. In this way, the network is not
overloaded with useless retransmissions when there is not
sufficient aggregate Toff.

We note that at high PCR (1 Mbps) the number of retrieved
PDUs is 47, and the 3 PDUs that are not retrieved are not
requested either. As we can see, the goodput is optimized when
the number of trusted sources does not exceed the service
capacity C (see Fig. 4 and 5).




(Fig. 6 legend: congestion PDUs, retrieved PDUs and not
requested PDUs; horizontal axis: PCR in Kbps.)



Fig. 7 shows the results of varying the idle time (Toff)
between 0.1 and 2 seconds. We now use 10 ON/OFF sources
over a link bandwidth of 25 Mbps. Each source generates 500
PDUs at PCR = 1 Mbps. We fixed a Cell Loss Rate of 5%
over the total emitted PDUs, which gives a constant value of
25 congested PDUs. As we can see, when the aggregated
Toff is sufficient (0.5 seconds), all the congested PDUs are
retrieved (25 retrieved out of 25 congested). When Toff is
less than 0.5 s, the retrieved PDUs fall to 12 PDUs at 0.3 s
and to 3 PDUs at 0.1 s. So when there is
insufficient Toff, the number of unretrieved PDUs increases,
but TAP preserves the goodput, since the unrecoverable
PDUs are not requested, to avoid overloading the network.

Another scenario consists of 1 source node, 1 active ATM
switch, n non-active switches and 1 destination node. When a
NACK arrives at a non-active switch, the switch simply
transfers the RM cell to the next switch. When the RM cell
arrives at the active switch, the switch uses the DMTE to
retransmit the requested PDU. This scenario is the same as
above; only the number of non-active switches varies. In this
configuration we have simulated the protocol with several
non-active switches and the results obtained show no changes.
Only the delay in transmissions varies, due to propagation
times, but the rate of retrieved PDUs is maintained, as we have
already shown.

Previous work [18,19] presents a point-to-multipoint
configuration consisting of 1 source node, 1 active ATM
switch, n non-active switches and n destination nodes. This is
equal to the above basic scenario; only the number of
destination nodes in multipoint connections varies. At present
we are working to add multipoint connections to TAP.
Considering the above results, we can see intuitively that the
total delay will change. The amount of DMTE memory
required in active switches also increases, in order to manage
the VPI/VCI of n connections.

We now describe several aspects on which
we are working to achieve better goodput.

We will consider other source traffic descriptors such as
SCR (Sustainable Cell Rate) and MBS (Maximum Burst
Size). With these parameters we can characterize the traffic
better. Also, since most applications use the TCP protocol
to transmit data in frame-based structures, we are
working to implement the Guaranteed Frame Rate (GFR)
[22] service class to provide a minimum service guarantee for
UBR, VBR and ABR services. In order to support GFR we
will simulate sources with a Minimum Cell Rate (MCR)
guarantee for a given MBS and Maximum Frame Size
(MFS). With the GFR service class, TAP will be able to
distinguish eligible and non-eligible frames and also
to discard cells properly. We are currently working to
enhance the architecture with other intelligent agents that
characterize the traffic and its class of service.

Fig. 6. Number of retrieved PDUs for different PCRs.

Fig. 7. Effect of varying Toff (series: retrieved PDUs and not
requested PDUs; horizontal axis: Toff in seconds).



VI. SUMMARY 

In this paper we have presented TAP as the architecture for
an active protocol that can take advantage of suitably
equipped active ATM switches. TAP manages a set of
privileged VCIs to improve trusted connections when the
switches are congested. To achieve these active
characteristics we use AcTMs (active ATM switches)
equipped with a small amount of supporting hardware and with
software agents of reduced implementation complexity. We have
verified that it is possible to retrieve a significant number
of PDUs with only the DMTE and a reasonable additional
complexity in the AcTMs switches, supported by software agents
that implement variants of EPD to resolve congestion and WFQ to
achieve fair allocation. The retransmission mechanism is
based on ARQ with NACK, which generates RM cells to request
PDUs. Our simulations demonstrate that the intuitive idea of
taking advantage of silent states in ON/OFF sources is valid.
Thus we can achieve better goodput and QoS in ATM
networks.